(54) Memory allocation in a multithreaded environment

(57) A method of allocating memory (16) in a multithreaded (parallel) computing environment in which
threads (30-33) running in parallel within a process are associated with one of a number of memory pools
(24, 38-40) of a system memory. The method includes the steps of establishing memory pools in the system
memory; mapping each thread to one of the memory pools; and, for each thread, dynamically allocating user
memory blocks from the associated memory pool. The method allows any existing memory management malloc
(memory allocation) package to be converted to a multithreaded version so that multithreaded processes are
run with greater efficiency.
Description

Background of the Invention

The invention relates to memory allocation and more particularly to memory allocation in a multithreaded
(parallel) environment.

In allocating memory for a computer program, most older languages (e.g., FORTRAN, COBOL) require that the
size of an array or data item be declared before the program is compiled. Moreover, the size of the array or data item
cannot be exceeded unless the program is changed and recompiled. Today, however, most modern programming
languages, including C and C++, allow the user to request memory blocks from the system memory at run-time and
release the blocks back to the system memory when the program no longer needs the blocks. For example, in these
modern languages, data elements often have a data structure with a field containing a pointer to a next data element.
A number of data elements may be allocated, at run-time, in a linked list or an array structure.

The C programming language provides memory management capability with a set of library functions known as
"memory allocation" routines. The most basic memory allocation function is called malloc, which allocates a requested
number of bytes and returns a pointer that is the starting address of the memory allocated. Another function known as
free returns the memory previously allocated by malloc so that it can be allocated again for use by other routines.

In applications in which memory allocation occurs in parallel, for example, in a multithreaded process, the malloc
and free functions must be "code-locked". Code-locking means that the library code of the process containing the
thread is protected with a global lock. This prevents data corruption in the event that one thread is modifying a global
structure when another thread is trying to read it. Code-locking allows only one thread to call any of the malloc functions
(e.g., malloc, free, realloc) at any given time, with other threads waiting until the thread is finished with its memory
allocation. Thus, in a multithreaded process in which memory allocation functions are used extensively, the speed of
the system is seriously compromised.
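
The bottleneck described above can be illustrated with a minimal sketch. It assumes POSIX threads rather than any particular vendor library, and the names global_lock, locked_malloc and locked_free are hypothetical; they are not taken from an existing malloc package.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One lock shared by every thread in the process (hypothetical name). */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Every allocation request, from any thread, serializes on the same lock. */
void *locked_malloc(size_t nbytes)
{
    assert(nbytes > 0);
    pthread_mutex_lock(&global_lock);
    void *p = malloc(nbytes);
    pthread_mutex_unlock(&global_lock);
    return p;
}

/* Releasing memory contends for the same lock as allocating it. */
void locked_free(void *p)
{
    pthread_mutex_lock(&global_lock);
    free(p);
    pthread_mutex_unlock(&global_lock);
}
```

With this arrangement two threads that never touch each other's data still wait on one another whenever both allocate or free at the same moment.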

Summary of the Invention

In general, in one aspect, the invention is a method of allocating memory in a multithreaded computing environment
in which threads running in parallel within a process each have an associated memory pool in a system memory. The
method includes the steps of establishing memory pools in the system memory; mapping each thread to one of the
memory pools; and, for each thread, dynamically allocating user memory blocks from the associated memory pool.
Each thread uses memory allocation routines (e.g., malloc) to manipulate its own memory pool, thereby providing
greater efficiency of memory management.

The invention converts an existing memory management malloc package to a multithreaded version so that
multithreaded processes are run with greater efficiency. Moreover, the invention is applicable to any application requiring
memory management in parallel; in particular, those applications requiring significant parallel memory management.
Furthermore, use of the invention is transparent from the application programmer's standpoint, since the user interface
is the same as that of the standard C library memory management functions (i.e., malloc, free, realloc).

In a preferred embodiment, the method may further include the step of preventing simultaneous access to a memory
pool by different threads. Having separate memory pools allows separate code-locking (e.g., mutex locking) to prevent
simultaneous access to the memory pools by the different threads, thereby eliminating the possibility of data corruption.
In existing standard memory allocation routines suitable for parallel execution, there is only a single code lock. Thus,
only one thread can make a memory allocation routine call at any given time. All other threads running in the process
must wait until the thread finishes with its memory allocation operation. In the invention, on the other hand, so long as
each thread is manipulating its own memory, memory allocation operations can be performed in parallel without any
delay. The separate code-locking feature only becomes important when a thread attempts to access the memory pool
of another thread. Such memory allocations of a memory pool not associated with that thread are fairly uncommon.
Thus, the invention provides an improvement in the performance of the multithreaded process by significantly reducing
time delays associated with memory allocation routine calls.

Preferred embodiments may include one or more of the following features. The step of dynamically allocating
memory blocks includes designating the number of bytes in the block desired to be allocated. For example, calling the
malloc function will allocate any number of required bytes up to a maximum size of the memory pool. The step of
establishing a memory pool for each thread may further include allocating a memory buffer of a preselected size
(e.g., 64 Kbytes). In the event that the size of the memory pool has been exhausted, the size of the memory pool may
be dynamically increased by allocating additional memory from the system memory in increments equal to the
preselected size of the buffer memory. Moreover, the method may further include allowing one of the threads to transfer
memory from the memory pool of another of the threads to its memory pool.

Each memory pool may be maintained as a data structure of memory blocks, for example, an array of static
variables identified by a thread index associated with one of the memory pools. The data structure includes a header
which includes the size of the memory block and the memory pool index with which it is associated. The size of the block
and the memory pool index may both be, for example, four bytes.

The method may further include the step of allowing each thread to deallocate or free a memory block to the
memory pool. The application may require that the memory block be freed from the thread which originally allocated
the memory block. Other applications may allow the memory block to be freed from a thread which did not originally
allocate the block.

Coalescing or merging deallocated (or freed) memory blocks may be performed to unite smaller fragmented blocks.
However, the method prevents coalescing of memory blocks from different pools.

In the event that the size of a memory block needs to be enlarged in order to store more data elements, the size
of an allocated block of memory allocated by a memory pool may be changed using a realloc routine. The method
requires that realloc preserves the original memory pool.

In general, in another aspect, the invention is a computer-readable medium storing a computer program for
allocating memory in a multithreaded computing environment in which threads run in parallel within a process, each thread
having access to a system memory. The stored program includes computer-readable instructions: (1) which establish
a plurality of memory pools in the system memory; (2) which map each thread to one of said plurality of memory pools;
and (3) which, for each thread, dynamically allocate user memory blocks from the associated memory pool. A
computer-readable medium includes any of a wide variety of memory media such as RAM or ROM memory, as well as external
computer-readable media, for example, a computer disk or CD ROM. A computer program may also be downloaded
into a computer's temporary active storage (e.g., RAM, output buffers) over a network. For example, the
above-described computer program may be downloaded from a Web site over the Internet into a computer's memory. Thus, the
computer-readable medium of the invention is intended to include the computer's memory which stores the
above-described computer program that is downloaded from a network.

In another aspect of the invention, a system includes memory, a portion of which stores the computer program
described above, a processor for executing the computer-readable instructions of the stored computer program and
a bus connecting the memory and processor.

Other advantages and features will become apparent from the following description of the preferred embodiment
and from the claims.

Brief Description of the Drawings

Fig. 1 is a block diagram of a multi-processing computer system which is suitable for use with the invention.
Fig. 2 illustrates the relationship between a multithreaded application and a shared memory.
Fig. 3 diagrammatically illustrates a data object in memory.
Fig. 4 illustrates the relationship between a multithreaded application and a shared memory in which more threads
than memory pools exist.
Fig. 5 is an example of an application which calls memory management functions from threads running within a
process.

Description of the Preferred Embodiments

Referring to Fig. 1, a simplistic representation of a multiprocessing network 10 includes individual processors
12a-12n of comparable capabilities interconnected to a system memory 14 through a system bus 16. All of the processors
share access to the system memory as well as other I/O channels and peripheral devices (not shown). Each processor
is used to execute one or more processes, for example, an application.

Referring to Fig. 2, an application 20 which may be running on one or more of the processors 12a-12n (Fig. 1) is
shown. Application 20 includes, here, a single thread 22 which has access to a section 24 of allocated memory within
the system memory 14. This memory section is referred to as a memory pool. The application also includes a
multithreaded portion shown here having four threads 30-33. Although four threads are shown, the number of threads
running at any given time can change since new threads may be repeatedly created and old threads destroyed during
the execution of the application. Each of threads 30-33, for example, may run on a corresponding one of processors
12a-12n. In other applications, all or multiple threads can run on a single processor. Thread 30 is considered to be the
main thread which continues to use the memory section 24 allocated by the application as a single thread. However,
additional threads 31-33 allocate their own memory pools 38-40 from the system memory 14. Thus, each thread is
associated with a memory pool for use in executing its operations. During the execution of the application running on
the threads, each thread may be repeatedly allocating, freeing and reallocating memory blocks from its associated
memory pool using memory allocation functions (i.e., malloc, free, realloc) which are described in greater detail below.
Moreover, while one thread is generally designated as the main thread, some of the remaining threads may be
designated for particular purposes.

Establishing Memory Pools

The number of memory pools (NUM_POOLS) is fixed. Although the malloc package programmer can change the
number of pools, the package must be rebuilt after doing so.

Establishing a memory pool for each thread includes allocating a memory buffer of a preselected size (e.g., 64
Kbytes). In the event that the size of the memory pool has been exhausted, the size of the memory pool may be
dynamically increased by allocating additional memory from the system memory. The additional memory may be
allocated, for example, using a Unix system routine called sbrk() which, in this implementation, is called internally from
within malloc and allocates the additional memory in increments equal to the preselected size of the buffer memory.
Allocating additional memory requires the pool to be locked, which prevents other memory functions from being performed
at the same time. Thus, the size of the memory buffer is selected to be large relative to the average amount of memory
requested by malloc() so that calls for increasing the size of the pool are infrequent.
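
The sizing rule for growing a pool can be sketched as follows. This is an illustration only: it assumes the policy used in the appendix listing, where a request smaller than the preselected increment is rounded up to one full increment and a larger request is passed through unchanged. The names POOL_INCREMENT and growth_request are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Preselected buffer size from the text: 64 Kbytes. */
#define POOL_INCREMENT ((size_t)64 * 1024)

/* Number of bytes to request from the system when a pool runs dry.
   Small requests are padded to a full increment so that the pool
   rarely has to be locked for growth (assumed policy). */
size_t growth_request(size_t needed)
{
    assert(needed > 0);
    return needed < POOL_INCREMENT ? POOL_INCREMENT : needed;
}
```

Because typical requests are far smaller than 64 Kbytes, most calls to malloc are satisfied from memory already in the pool and never reach the system routine.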

Each memory pool may be set up as a binary tree data structure with individual blocks of memory comprising the
pool. The binary tree is ordered by size, although it may be ordered by address. Other data structures (e.g., linked
lists) may alternatively be used; however, a binary tree structure may be preferred because of the increased speed it
offers in searching. Moreover, a balancing or self-adjusting algorithm may be used to further improve the efficiency of
the search.

Referring to Fig. 3, each block of memory 40 is identified by a data object 40 having a header 42 with a length
consistent with the alignment requirements of the particular hardware architecture being used. For example, certain
hardware configurations used by Sun Microsystems Inc., Mountain View, CA require the header to be eight bytes in
length to provide an alignment boundary consistent with a SPARC architecture. The first four bytes of the header
indicate the size of the block, with the remaining four bytes indicating a pool number.

Memory Management Functions

Each thread 30-33 allocates memory for its memory pool using a set of memory allocation routines similar to those
from a standard C library. The basic function for allocating memory is called malloc and has the following syntax:

void * malloc (size)

where size indicates the number of bytes requested.

Another memory allocation routine is free, which releases an allocated storage block to the pool of free memory
and has the following syntax:

void free (old)

where old is the pointer to the block of memory being released.

Still another memory allocation routine is realloc, which adjusts the size of the block of memory allocated by malloc.
Realloc has the following syntax:

void * realloc (old, size)

where:

old is the pointer to the block of memory whose size is being altered; and

size is the new size of the block.

Converting an Existing Malloc Package to a Multithreaded Malloc Package

In order to convert an existing memory management package which uses a single lock to a parallel memory
management package, all static variables used in the above described memory management functions are converted into
static arrays. For example, the binary tree structures associated with the memory pools are stored as a static array.
Each element of the static array is identified by its thread index and is associated with a given memory pool. There is
a separate static array element within each array for each pool. Thus, searching through the particular data structure
(e.g., binary tree) for each thread can be performed in parallel.
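
The conversion of a static variable into a static array can be sketched as follows. For brevity this illustration uses a singly linked free list rather than a binary tree, and the names node, free_list, push_free and free_count are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_POOLS 4

typedef struct node { struct node *next; } node;

/* Before conversion, a single-lock package would keep one list head:
       static node *free_list;
   After conversion, the same variable becomes one element per pool. */
static node *free_list[NUM_POOLS];

/* Push a node onto the free list of the given pool only. */
void push_free(unsigned pool, node *n)
{
    assert(pool < NUM_POOLS && n != NULL);
    n->next = free_list[pool];
    free_list[pool] = n;
}

/* Count the nodes currently on one pool's free list. */
size_t free_count(unsigned pool)
{
    assert(pool < NUM_POOLS);
    size_t count = 0;
    for (const node *n = free_list[pool]; n != NULL; n = n->next)
        count++;
    return count;
}
```

Since each pool index selects a disjoint element, operations on different pools touch no shared state and can proceed side by side.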

Each thread, therefore, can repeatedly execute any of the above routines to manage memory allocation of its
associated memory pool. For example, referring again to Fig. 2, main thread 30 may execute a procedure in which
memory blocks within memory pool 24 may be allocated, freed, and allocated again numerous times. Simultaneously,
threads 31-33 may be executing procedures in which memory is being allocated and freed from and to their respective
memory pools 38-40.

Mapping Threads to Memory Pools

Whenever a memory allocation function is called, a thread-identifying routine within each one of these functions
is used to identify the thread making the memory allocation request. The thread-identifying function returns the thread
index of the thread making the request. For example, the Solaris Operating System (OS), a product of Sun Microsystems
Inc., uses in one implementation a function called thr_self().

Another algorithm is then used to map each thread index to a memory pool number. For example, the described
embodiment uses the following macro known as GET_THREAD_INDEX which receives the thread index and returns
an associated pool number:

#define GET_THREAD_INDEX(self) \
((self) == 1 ? 0 : 1 + ((self) - 4) % (NUM_POOLS - 1))

where:

self is the thread index; and

NUM_POOLS is the number of memory pools.

As mentioned above, one thread is generally designated as the main thread with remaining threads designated for
other purposes. For example, the SOLARIS OS uses a thread numbering system which reserves the first thread as a
main thread, the second and third threads as system threads and subsequent threads as user threads. With the above
macro, the memory pools are numbered 0 to NUM_POOLS-1. The first portion of the above macro (self == 1 ? 0)
ensures that the main thread is always associated with the first pool number. Thus, if self is equal to 1 (i.e., it is the
main thread), then the pool number is 0. Otherwise, as shown in the remaining portion of the macro after the ":", the
remainder obtained by dividing the thread index minus the constant four by NUM_POOLS-1 is added to the number
1 to arrive at the pool number. For example, if there are only four memory pools (i.e., NUM_POOLS = 4) and the thread
index is 4, the associated pool number returned by the macro is 1. Thread indices of 5 and 6 would have associated
memory pools numbered 2 and 3, respectively.
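
The arithmetic of the mapping can be checked with a small function that mirrors the macro. This is a sketch for illustration: the function name pool_for_thread is hypothetical, and the assertion reflects the numbering scheme described above, in which indices 2 and 3 belong to system threads.

```c
#include <assert.h>

#define NUM_POOLS 4

/* Function form of the mapping: thread index 1 (the main thread) uses
   pool 0; user threads (index 4 and up) are spread round-robin over
   pools 1 .. NUM_POOLS-1. */
unsigned pool_for_thread(unsigned self)
{
    assert(self == 1 || self >= 4);  /* 2 and 3 are system threads here */
    return self == 1 ? 0 : 1 + (self - 4) % (NUM_POOLS - 1);
}
```

With NUM_POOLS equal to 4, thread indices 4, 5 and 6 map to pools 1, 2 and 3, and thread index 7 wraps around to pool 1 again.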

In applications in which the number of threads existing at any given time exceeds the number of established pools,
the additional threads share memory pools with another thread associated with that pool. Referring to Fig. 4, for
example, an application is shown in which a new fifth thread 34 has been created. Because only four memory pools were
established, the above mentioned macro is used to map thread 34 to first memory pool 24 originally associated with
only thread 30. In this situation, the mutex lock associated with memory pool 24 prevents access by either thread 30
or 34, if the other is using the pool. In the example of the preceding paragraph, macro GET_THREAD_INDEX would
map threads having thread indices of 4 and 7 to memory pool #1.

Code-Locking Memory Pools

Each memory pool 24 and 38-40 is protected by its own mutual exclusion (mutex) lock. Like the data structures
associated with each memory pool, mutex locks are stored in a static array. Each mutex lock causes no delay in a
thread that is allocating, deallocating or reallocating one or more memory blocks from its own memory pool. However,
when a thread not associated with a particular memory pool attempts to access a memory block already allocated by
the thread associated with that pool, the mutex lock prevents the non-associated thread from deallocating or reallocating
a memory block from that pool. Thus, the lock protects the memory blocks from being updated or used by more than
one thread at a time, thereby preventing the corruption of data in the memory block. Such attempts to allocate,
deallocate or reallocate memory blocks from a memory pool not associated with a thread are relatively infrequent. This
feature provides a substantial improvement in the speed performance of the system over conventional schemes in
which a single mutex lock is used for all memory management routines. Using a single mutex lock can significantly
degrade the performance of a multithreaded application. With this approach, once a thread makes a memory
management function call (i.e., malloc, free, or realloc) all other threads must wait until the thread has finished performing its
memory management function. By providing separate mutex locks for each memory pool, each thread can, in parallel,
allocate and free its own memory within its own memory pool while preventing access from non-associated threads.
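
A per-pool locking scheme of this kind can be sketched as follows. The sketch assumes POSIX threads in place of the Solaris primitives, replaces the free-list search with a trivial bump allocator, and ignores alignment; pool_t, pools and pool_alloc are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_POOLS 4
#define POOL_BYTES 1024

/* Hypothetical pool: a fixed buffer with a bump offset and its own lock. */
typedef struct {
    pthread_mutex_t lock;
    size_t used;
    unsigned char buf[POOL_BYTES];
} pool_t;

/* The locks live in a static array, one element per pool. */
static pool_t pools[NUM_POOLS] = {
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
};

/* Allocate nbytes from one pool. Only that pool's lock is taken, so
   threads mapped to other pools are never blocked by this call. */
void *pool_alloc(unsigned pool, size_t nbytes)
{
    assert(pool < NUM_POOLS);
    void *p = NULL;
    pthread_mutex_lock(&pools[pool].lock);
    if (nbytes <= POOL_BYTES - pools[pool].used) {
        p = pools[pool].buf + pools[pool].used;
        pools[pool].used += nbytes;
    }
    pthread_mutex_unlock(&pools[pool].lock);
    return p;
}
```

A thread contends for a lock only when another thread is working in the same pool, which happens when pools are shared or when a block is released by a thread other than its allocator.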
As memory blocks are repeatedly allocated, freed and reallocated by a thread, the memory pool may become
fragmented into smaller and smaller blocks. Coalescing or merging of freed memory blocks which are contiguous is
periodically performed to form larger memory blocks which can be reallocated by the thread. However, before a memory
block can be coalesced with an adjacent memory block, the described embodiment first determines whether the blocks
are from the same pool. If not, the blocks are not coalesced, thus avoiding the possibility of data corruption.

Merge Malloc Pools

The extent to which the individual threads use memory management may vary significantly. For example, referring
again to Fig. 2, threads 31-33 may complete their tasks prior to the completion of the tasks performed by main thread
30. In such situations, the main thread may call an optional interface function which transfers the memory allocated
by threads 31-33 to the main thread 30. In other words, the function may be called by the main thread at the end of
the multithreaded portion to consolidate to the main thread the memory previously allocated by the other threads. The
routine used in this embodiment has the following prototype:

void merge_malloc_pools (void);

The use of this function may not be needed in applications in which the multiple threads perform significant memory
management throughout the application.

Referring to Fig. 5, a simplistic representation of an application is shown running within main thread 30 and user
thread 31. It is assumed here that memory pools 24 and 38 (Fig. 2), which are associated with threads 30 and 31,
respectively, have already been established. With respect to main thread 30, a first malloc routine call 50 is made
requesting a block of memory having SIZE#1 bytes. Later in the application, a first free routine call 52 is made to return
a block of memory identified by pointer OLD. At this time, coalescing is generally performed to combine the returned
block of memory with an adjacent block, so long as they are both from the same memory pool. Still later in the thread,
a second malloc routine call 54 is made requesting a block of memory having SIZE#2 bytes. A realloc call 56 requesting
that a block of memory identified by pointer OLD be resized to SIZE#3 bytes follows. Thread 31 is shown executing
procedures concurrently with thread 30. For example, a first malloc routine call 60 is made, followed sometime later by
a first free routine call 62. Finally, in this example, after completion of the multithreaded portion of the application, a
merge_malloc_pools routine 64 is called to consolidate memory blocks allocated by thread 31 to the main thread 30.

Attached as an Appendix is source code software for one implementation of a method of converting an existing
malloc package to a multithreaded version of a malloc package. The source code represents a version of the program
based on the set of memory allocation routines described in The C Programming Language, B.W. Kernighan and D.M.
Ritchie, Prentice Hall (1988).

Other embodiments are within the following claims.

/*
A multithreaded (MT-hot) version of malloc and friends.
Based on the malloc package from
the Kernighan & Ritchie ANSI C book (page 185), with modifications.

By Greg Nakhimovsky
Sun Microsystems, Inc.
Market Development Engineering
January 1994

Changes from the original K&R version:

* All malloc routines are made MT-safe.

* realloc() is added.

* A separate free pool is created for each thread up to NUM_POOLS.
  NUM_POOLS is currently set to 4 but can be adjusted.

* The pool number is stored in the same header.

* Increased header size to 16 bytes to make room for pool number.

* Each pool is protected by its own mutex lock.

* free() returns the freed block to the pool the block was malloc'ed from.

* realloc() modifies the block in the original pool.

* Coalescing (merging) is only done within the same pool.

* A new routine is added for external interface: merge_malloc_pools().
  If called by the application from the main thread after the threaded
  section is over, it transfers all memory blocks from the additional pools
  back to the main pool. This call will eliminate a "memory leak", in a
  sense that the main thread can reuse memory used by other threads.

* To reduce additional fragmentation, the default block size for sbrk()
  is increased from 8K to 64K.

When multiple threads allocate and deallocate their own memory, they don't
wait on their own locks. When one thread tries to free or reallocate memory
allocated by another thread, the lock protects the free list from being
updated or used by more than one thread at a time. The lock is also used when
there are more than NUM_POOLS threads active at the same time.

*/


#include <stdio.h>
#include <thread.h>
#include <synch.h>

/* number of pools for different threads */
#define NUM_POOLS 4

/* minimum number of 16-byte units to request from system - 64K in this case */
#define NALLOC 4096

#define MAGIC 0x5555 /* to check integrity of pointers to free, realloc */
#ifdef _REENTRANT /* will not compile without _REENTRANT defined */
static mutex_t _pool_locks[NUM_POOLS];
#endif /* _REENTRANT */

#define GET_THREAD_INDEX(self) (((self) == 1) ? 0 : 1 + ((self) - 4) % (NUM_POOLS - 1))

#ifndef NULL
#define NULL (0)
#endif


typedef double Align;

/* increased the header size to 16 bytes - GN */

union header {
    struct {
        union header *ptr;  /* next block if on the free list */
        unsigned size;      /* size of this block */
        unsigned pool;      /* pool number */
        unsigned magic;     /* for checking - using 4 extra bytes */
    } s;
    Align x[2];             /* need 16 bytes because of the pool number */
};
typedef union header Header;

static Header base[NUM_POOLS];  /* empty lists to get started */
/* starts of free lists */
static Header *freep[NUM_POOLS] = {NULL, NULL, NULL, NULL};

static void *malloc_locked(unsigned nbytes, unsigned thread_index);
static void free_locked(void *ap, unsigned thread_index);
static Header *morecore(unsigned nunits, unsigned thread_index);

void *malloc(unsigned nbytes)
{
    void *ret;
    unsigned self, thread_index;

    self = thr_self();
    thread_index = GET_THREAD_INDEX(self);

    mutex_lock(&_pool_locks[thread_index]);
    ret = malloc_locked(nbytes, thread_index);
    mutex_unlock(&_pool_locks[thread_index]);

    return ret;
}

static void *malloc_locked(unsigned nbytes, unsigned thread_index)
{
    Header *p, *prevp;
    unsigned nunits;

    nunits = (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
    if ((prevp = freep[thread_index]) == NULL) {  /* no free list yet */
        base[thread_index].s.ptr = freep[thread_index] =
            prevp = &base[thread_index];
        base[thread_index].s.size = 0;
        base[thread_index].s.pool = thread_index;
        base[thread_index].s.magic = MAGIC;
    }
    for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
        if (p->s.size >= nunits) {  /* big enough */
            if (p->s.size == nunits)  /* exactly */
                prevp->s.ptr = p->s.ptr;
            else {
                p->s.size -= nunits;
                p += p->s.size;
                p->s.size = nunits;
                p->s.pool = thread_index;
                p->s.magic = MAGIC;
            }
            freep[thread_index] = prevp;
            return (void *)(p + 1);
        }
        if (p == freep[thread_index])  /* wrapped around free list */
            if ((p = morecore(nunits, thread_index)) == NULL)
                return NULL;  /* none left */
    }
}

/* EP 0 817 044 A2 */

static Header *morecore(unsigned nu, unsigned thread_index)
{
    char *cp, *sbrk(int);
    Header *up;

    if (nu < NALLOC)
        nu = NALLOC;
    cp = sbrk(nu * sizeof(Header));  /* sbrk() assumed locked - GN */
    if (cp == (char *) -1)  /* no space at all */
        return NULL;
    up = (Header *) cp;
    up->s.size = nu;
    up->s.pool = thread_index;
    up->s.magic = MAGIC;
    free_locked((void *)(up + 1), thread_index);
    return freep[thread_index];
}


void free(void *ap)
{
    Header *bp;
    unsigned thread_index;
    int i;

    if (ap == NULL)
        return;
    /* free the block of the thread which allocated it */
    bp = (Header *)ap - 1;  /* point to block header */
    if (bp->s.magic != MAGIC) {
        printf("bogus pointer %x passed to free()\n", ap);
        abort();
    }
    thread_index = bp->s.pool;

    mutex_lock(&_pool_locks[thread_index]);
    free_locked(ap, thread_index);
    mutex_unlock(&_pool_locks[thread_index]);
}

static void free_locked(void *ap, unsigned thread_index)
{
    Header *bp, *p;
    int i;

    bp = (Header *)ap - 1;  /* point to block header */

    for (p = freep[thread_index]; !(bp > p && bp < p->s.ptr); p = p->s.ptr)
        if (p >= p->s.ptr && (bp > p || bp < p->s.ptr))
            break;  /* freed block at start or end of arena */

    if (bp + bp->s.size == p->s.ptr &&
        bp->s.pool == p->s.pool) {  /* join to upper nbr */
        bp->s.size += p->s.ptr->s.size;
        bp->s.ptr = p->s.ptr->s.ptr;
    } else
        bp->s.ptr = p->s.ptr;
    if (p + p->s.size == bp &&
        p->s.pool == bp->s.pool) {  /* join to lower nbr */
        p->s.size += bp->s.size;
        p->s.ptr = bp->s.ptr;
    } else
        p->s.ptr = bp;
    freep[thread_index] = p;
}

void *realloc(void *old, unsigned nbytes)
/* Added by GN since realloc() is not given in the K&R book */
{
    void *new;
    Header *bp;
    unsigned self, thread_index, ncopy;

    if (old == NULL)
        return malloc(nbytes);
    self = thr_self();
    thread_index = GET_THREAD_INDEX(self);

    bp = (Header *)old - 1;  /* point to block header */
    if (bp->s.magic != MAGIC) {
        printf("bogus pointer %x passed to realloc\n", old);
        abort();
    }
    if (bp->s.pool != thread_index)
        thread_index = bp->s.pool;
    mutex_lock(&_pool_locks[thread_index]);

    /* for simplicity,
       always allocate a new block, copy and free the old one */
    if ((new = malloc_locked(nbytes, thread_index)) == NULL) {
        mutex_unlock(&_pool_locks[thread_index]);
        return NULL;
    }
    if (nbytes > 0) {
        ncopy = sizeof(Header) * (bp->s.size - 1);
        if (ncopy > nbytes)
            ncopy = nbytes;
        memcpy(new, old, ncopy);
    }
    free_locked(old, thread_index);
    mutex_unlock(&_pool_locks[thread_index]);
    return new;
}


/* New externally called function merging all additional pools into
   the main thread's pool. Should only be called from the main thread.
   Assumes that only the main thread is active.
*/

void merge_malloc_pools(void)
{
    int i;
    Header *p, *prevp;

    /* skip the main thread's pool (0) */
    for (i = 1; i < NUM_POOLS; i++) {
        prevp = freep[i];
        if (prevp != NULL) {
            for (p = prevp->s.ptr; ; prevp = p, p = p->s.ptr) {
                if (p->s.size > 0) {
                    p->s.pool = 0;
                    /* no need to lock - main thread only */
                    free_locked((void *)(p + 1), 0);
                }
                if (p == freep[i])  /* end of list */
                    break;
            }
        }
        freep[i] = NULL;
    }
}

Claims

1. A method of allocating memory in a multithreaded computing environment in which a plurality of threads run in
parallel within a process, each thread having access to a system memory, the method comprising:

establishing a plurality of memory pools in the system memory;
mapping each thread to one of said plurality of memory pools; and

for each thread, dynamically allocating user memory blocks from the associated memory pool.

2. The method of claim 1 wherein the step of dynamically allocating memory blocks includes designating the number
of bytes in the block desired to be allocated. 

3. The method of claim 1 further comprising the step of preventing simultaneous access to a memory pool by different 
threads. 

4. The method of claim 1 further comprising the step of establishing a memory pool for each thread comprises
allocating a memory buffer of a preselected size.

5. The method of claim 4 further comprising the step of dynamically increasing the size of the memory pool by
allocating additional memory from the system memory in increments equal to the preselected size of the buffer memory.

6. The method of claim 4 wherein the preselected size of the buffer is 64 Kbytes. 

7. The method of claim 1 further comprising the step of one of the threads transferring memory from the memory
pool of another of the threads to its memory pool.

8. The method of claim 1 wherein each memory pool is defined by an array of static variables identified by a thread
index associated with a memory pool. 

9. The method of claim 6 wherein each memory pool is maintained as a data structure of memory blocks. 

10. The method of claim 9 wherein each memory block comprises a header including the size of the memory block 
and the memory pool index to which it is associated. 

11. The method of claim 10 wherein the size of the block and the memory pool index are each four bytes. 

12. The method of claim 1 further comprising the step of each thread deallocating a memory block to the memory pool.

13. The method of claim 12 wherein the thread originally allocating the memory block deallocates it to its associated 
memory pool. 

14. The method of claim 12 further comprising the step of coalescing deallocated memory blocks and preventing
coalescing of memory blocks from different pools.

15. The method of claim 1 further comprising the step of changing the size of an allocated block of memory allocated
by a memory pool. 

16. A computer-readable medium storing a computer program which is executable on a computer including a memory,
the computer program for allocating memory in a multithreaded computing environment in which a plurality of
threads run in parallel within a process, each thread having access to a system memory, the stored program 
comprising: 

computer-readable instructions which establish a plurality of memory pools in the system memory; 
computer-readable instructions which map each thread to one of said plurality of memory pools; and 
computer-readable instructions which, for each thread, dynamically allocate user memory blocks from the 
associated memory pool. 

17. The computer-readable medium of claim 16 wherein the stored program further comprises computer instructions
which prevent simultaneous access to a memory pool by different threads.

18. The computer-readable medium of claim 16 wherein the stored program further comprises computer instructions 
which causes one of the threads to transfer memory from the memory pool of another of the threads to its memory 
pool. 

19. The computer-readable medium of claim 16 wherein each memory pool is defined by an array of static variables 
identified by a thread index associated with a memory pool. 

20. The computer-readable medium of claim 16 wherein the stored program further comprises computer instructions
which coalesces deallocated memory blocks and prevents coalescing of memory blocks from different pools.

21. A system comprising: 

memory, a portion of said memory storing a computer program for allocating memory in a multithreaded
computing environment in which a plurality of threads run in parallel within a process, each thread having access
to the memory, the stored program comprising: 

computer-readable instructions which establish a plurality of memory pools in the memory; 
computer-readable instructions which map each thread to one of said plurality of memory pools; and
computer-readable instructions which, for each thread, dynamically allocate user memory blocks from the
associated memory pool; 

a processor to execute said computer-readable instructions; and
a bus connecting the memory to the processor. 
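
Several of the claimed features can be illustrated together: a block header with a four-byte size and a four-byte pool index (claims 10 and 11), per-pool state held in static arrays indexed by thread index (claim 8), and pool growth in 64-Kbyte increments (claims 5 and 6). The following is a minimal sketch under stated assumptions, not the claimed implementation: `PoolHeader`, `SKETCH_NUM_POOLS`, and the `sketch_*` functions are hypothetical names, the modulo mapping from thread identifier to pool index is assumed, and the sketch only does the bookkeeping without requesting memory from the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_POOLS 4u          /* hypothetical pool count */
#define SKETCH_BUFFER_SIZE 65536u    /* 64 Kbytes, as in claim 6 */

/* Block header: a four-byte size and a four-byte pool index. */
typedef struct {
    uint32_t size;
    uint32_t pool;
} PoolHeader;

/* Per-pool state kept in a static array indexed by thread index. */
static size_t sketch_pool_bytes[SKETCH_NUM_POOLS];

/* Map a thread identifier to a pool index (assumed modulo mapping). */
unsigned sketch_thread_index(unsigned thread_id)
{
    return thread_id % SKETCH_NUM_POOLS;
}

/* Round a request up to a whole number of 64-Kbyte buffers,
   with a minimum of one buffer. */
size_t sketch_grow_bytes(size_t nbytes)
{
    size_t buffers = (nbytes + SKETCH_BUFFER_SIZE - 1) / SKETCH_BUFFER_SIZE;
    if (buffers == 0)
        buffers = 1;
    return buffers * SKETCH_BUFFER_SIZE;
}

/* Record that a pool grew to satisfy a request of nbytes and
   return the pool's new total size in bytes. */
size_t sketch_grow_pool(unsigned pool, size_t nbytes)
{
    assert(pool < SKETCH_NUM_POOLS);
    sketch_pool_bytes[pool] += sketch_grow_bytes(nbytes);
    return sketch_pool_bytes[pool];
}
```

Growing in fixed increments means a pool asks the system for memory far less often than it serves allocations, which keeps most requests inside the pool's own lock.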